How Data Types Affect Composite Index Sorting, Comparison & Efficiency
In a composite (multi-column) index, MySQL orders entries by comparing columns left to right, and each column is compared according to its own data type. The data type therefore influences how comparisons are performed, how large the B-Tree becomes, and how efficiently the index can be used.
MySQL sorts composite index entries lexicographically (column1 → column2 → column3...).
Numeric types (INT, BIGINT) sort using binary integer comparison — fast and CPU-efficient.
VARCHAR sorts according to the column's collation (case sensitivity, accent rules), so each comparison costs more than an integer comparison.
DATE and DATETIME sort using chronological numeric ordering, which is efficient.
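The ordering rules above can be modeled in a few lines of Python. This is only a sketch of the sort order, not of MySQL's storage engine: the rows are hypothetical, and str.casefold() is a rough stand-in for a case-insensitive collation (real collations use their own weight tables).

```python
from datetime import date

# Hypothetical rows: (user_id INT, status VARCHAR, created_at DATE).
rows = [
    (2, "active", date(2024, 1, 5)),
    (1, "pending", date(2024, 3, 1)),
    (1, "Active", date(2024, 2, 10)),
    (1, "active", date(2023, 12, 31)),
    (2, "Active", date(2023, 6, 1)),
]


def index_key(row):
    """Sort key for one index entry, compared column by column.

    casefold() approximates a case-insensitive collation (assumption).
    """
    user_id, status, created_at = row
    return (user_id, status.casefold(), created_at)


# Case-insensitive model: "Active" and "active" are equal, so created_at
# decides their relative order within each user_id.
index_order = sorted(rows, key=index_key)

# Binary model: strings compare by code point, so "Active" sorts before
# "active" regardless of created_at.
binary_order = sorted(rows)

for entry in index_order:
    print(entry)
```

In the case-insensitive model the two "active" rows of user 1 are ordered by date; in the binary model the uppercase spelling comes first. The column-by-column (lexicographic) comparison is the same in both cases.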
INT comparisons are exact, fixed-length, and fast since values are stored in binary.
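VARCHAR comparisons depend on collation: with multi-byte character sets such as utf8mb4, characters may need to be decoded and mapped to collation weights before they are compared.
DATE values are stored as a packed 3-byte integer (year, month, day), so chronological comparison is a plain integer comparison: consistent and fast.
Mixed-type comparisons trigger implicit type conversion. Comparing a VARCHAR column with a numeric literal (e.g., WHERE code = 42) forces MySQL to convert every column value, so the index cannot be used for a lookup. Comparing an INT column with a string literal (e.g., WHERE user_id = '42') converts only the constant, and the index remains usable.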
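A small model shows why a numeric comparison against a string column defeats the index. The values below are hypothetical, and Python's float() is only a stand-in for MySQL's string-to-number conversion (which is more lenient); the point is that strings equal to 42 as numbers are not adjacent in string order.

```python
import bisect

# Hypothetical VARCHAR values, kept in string order as an index would keep them.
codes = sorted(["100", "42", "042", "42.0", "5", "4.2e1", "9"])

# String comparison (WHERE code = '42'): one binary search finds the entry.
string_hit = bisect.bisect_left(codes, "42")

# Numeric comparison (WHERE code = 42): every entry has to be converted
# before it can be compared, so the whole list is visited.
numeric_hits = [i for i, c in enumerate(codes) if float(c) == 42.0]

print(codes)
print("string lookup position:", string_hit)
print("numeric match positions:", numeric_hits)
```

The numeric matches land at positions 0, 2, 3, and 4, with "100" sitting between them, so no single range of the sorted list covers them.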
Composite index efficiency tends to be highest when selective, fixed-length, simple types (like INT) appear first, provided queries actually filter on those leading columns.
VARCHAR columns reduce index efficiency because collation-based comparisons require more CPU.
Using large VARCHAR columns increases index size → fewer index entries per page → more I/O.
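A back-of-the-envelope calculation makes the page-density effect concrete. It assumes InnoDB's default 16 KB page and hypothetical key sizes, and it deliberately ignores record headers, page headers, the primary key stored with each secondary index entry, and page fill factor, so the absolute numbers are only indicative.

```python
PAGE_SIZE = 16 * 1024  # InnoDB's default page size in bytes


def entries_per_page(key_bytes):
    """Rough upper bound on index entries per page (overheads ignored)."""
    return PAGE_SIZE // key_bytes


# Hypothetical key (INT, VARCHAR, DATE): 4 + string bytes + 3.
short_key = 4 + 10 + 3    # status averages about 10 bytes
long_key = 4 + 200 + 3    # a long string averaging about 200 bytes

short_fit = entries_per_page(short_key)
long_fit = entries_per_page(long_key)

print(short_fit, long_fit)  # 963 vs 79 entries per page
```

Under these assumptions the long string cuts page density by roughly a factor of twelve, which means more pages to read for the same number of entries.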
DATE columns are compact (3 bytes) and index-friendly.
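The 3-byte figure follows from how DATE is packed. The sketch below assumes the layout given in MySQL's storage documentation, YYYY×16×32 + MM×32 + DD; the function name pack_date is illustrative, not a MySQL API.

```python
from datetime import date


def pack_date(d):
    """Pack a date as YYYY*16*32 + MM*32 + DD (assumed DATE layout)."""
    return d.year * 16 * 32 + d.month * 32 + d.day


dates = [date(2024, 3, 15), date(1999, 12, 31), date(2024, 3, 14), date(2000, 1, 1)]

# Sorting by the packed integer gives the same order as sorting by date.
by_packed = sorted(dates, key=pack_date)
by_calendar = sorted(dates)

# Even the largest supported date fits in 3 bytes (24 bits).
largest = pack_date(date(9999, 12, 31))

print(pack_date(date(2024, 3, 15)))  # 1036399
print(by_packed == by_calendar, largest < 2 ** 24)
```

Because the year occupies the high bits, followed by month and day, integer order and chronological order coincide.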
Example index: INDEX (user_id, status, created_at), where user_id is INT, status is VARCHAR, and created_at is DATE.
Index entries are ordered first by user_id (fast binary integer ordering).
Within each user_id, rows are sorted by status using collation rules.
Finally, for each status group, rows are sorted by created_at chronologically.
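The index can only be used left to right (leftmost-prefix rule): a query can seek on (user_id), (user_id, status), or (user_id, status, created_at), but a filter on status or created_at alone generally cannot seek into it.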
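The prefix rule falls out of the sort order, which a sorted list of tuples and Python's bisect module can illustrate. The data and helper names are hypothetical, and a sorted list is only a model of a B-Tree's leaf order.

```python
import bisect
from datetime import date

# Hypothetical index entries, sorted by (user_id, status, created_at).
index = sorted([
    (1, "active", date(2024, 1, 5)),
    (1, "active", date(2024, 2, 1)),
    (1, "closed", date(2023, 11, 20)),
    (2, "active", date(2024, 1, 9)),
    (2, "closed", date(2024, 3, 3)),
    (3, "active", date(2023, 12, 25)),
])


def seek_user(user_id):
    """Leftmost prefix (user_id): two binary searches bound one range."""
    lo = bisect.bisect_left(index, (user_id,))
    hi = bisect.bisect_left(index, (user_id + 1,))
    return index[lo:hi]


def seek_user_status(user_id, status):
    """Prefix (user_id, status): still a single contiguous range."""
    lo = bisect.bisect_left(index, (user_id, status))
    hi = bisect.bisect_right(index, (user_id, status, date.max))
    return index[lo:hi]


def scan_status(status):
    """No leading column: every entry must be inspected."""
    return [i for i, entry in enumerate(index) if entry[1] == status]


print(seek_user(2))
print(seek_user_status(1, "active"))
print(scan_status("active"))  # positions 0, 1, 3, 5: not one range
```

Entries matching a leading prefix sit next to each other, so a seek finds them; entries matching only status are scattered across the user_id groups.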
Data types with variable length or collation overhead slow down comparisons.
The order of columns in composite indexes should prioritize: high selectivity → fixed-length → frequently filtered columns.
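Implicit type conversion applied to an indexed column (for example, comparing a VARCHAR column with a number) prevents index lookups on that column and can force a full scan.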